Problem 2 - Practical HBM

One- and Two-Component Gaussian Mixture Hierarchcial Bayesian Modeling with PyJAGS

Simulating the eccentricity distribution of hot Jupiter exoplanets from the Kepler Mission

LSSTC DSFP Session 4, September 21st, 2017

Author: Megan I. Shabram, PhD, NASA Postdoctoral Program Fellow, mshabram@gmail.com
  1. Use the PyJAGS model and analysis code below to explore a two-component truncated Gaussian mixture model
  2. Using the simulated data code cells below, evaluate the one-component truncated Gaussian generative model simualted data with the two-component Gaussian mixture HBM. What do you notice about the posteriors for the mixture fractions? (Be sure to rename your output files).
  3. Repeat this exersize but now evaluating simulated data from a one-component generative model with a two-component HBM. (Also be sure to rename your output files here as well).
  4. Notebook 3: Using the JAGS model code block for a break-point HBM, set up and evaluate the break-point HBM on the real Kepler eclipsing binary data set provided (some code is provided for analysis, but you may also want to copy code from previous notebooks, and be sure to adjust the number of traceplots and 2d marginals for latent variables.

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pyjags
import pickle

from __future__ import division, print_function
from pandas.tools.plotting import *
from matplotlib import rcParams
rcParams["savefig.dpi"] = 100
rcParams["font.size"] = 20
#plt.style.use('ggplot')

%matplotlib inline
%config InlineBackend.figure_format = 'retina'
#%qtconsole

Section 1. below provides code to run one-component h and k data through the one-component HBM, and section 2. provides code to run two-component h and k data through a two-component HBM. After executing and assessing these 2 sections, make a new notebook, or rearrange this notebook in order to run one-compinent data through a two-component HBM, and two-component data through the one-component HBM, (swap the data and HBM combo) to answer Question 2 above. Pay attention to the mixture fractions when running one-component data through the two-component HBM.

1. Generate simulated data sets: one- and two-component truncated Gaussian mixture generative models:


In [3]:
## function to draw from truncated normal, this function will be used for both the 
## one- and two-componenet cases in this workbook.  

def rnorm_bound( Ndata, mu, prec, lower_bound = 0.0, upper_bound = float('Inf')):
    x = np.zeros(Ndata)
    #print(x)
    for i in range(0, Ndata): 
            #print(i)
            while True:
                x[i] = np.random.normal(mu,prec,1)
                if( (x[i]>lower_bound) and (x[i]<upper_bound) ): 
                    break
    return x;

In [4]:
## Check the output to make sure the function is outputting as expected. 
#print(np.random.normal(0,0.0001,10))
print(rnorm_bound(10, 0.0, 0.001, lower_bound=-1,upper_bound=1))


[ -1.80719882e-04   3.91699486e-04   1.73292193e-04  -5.09164397e-04
   1.00014241e-03   1.39972357e-03   8.65809575e-04  -1.37250506e-03
   9.29018934e-05  -1.25781229e-03]

Below we are now simulating the distribution of both "$\boldsymbol{e \cos \omega}$" and "$\boldsymbol{e \sin \omega}$", $h$ and $k$ respectively, for hot Jupiter exoplanets from Kepler.


In [5]:
## One-component Gaussian Mixture Simulated Data for both projected eccentricity terms
## Below we designate the population values of our generative model. These are the 
## truths that we should recover if our hiararchical Bayesian model is properly specified 
## and diagnostics have indicated that the simulation has "not not converged". "You can't 
## prove convergence, at best you can fail to prove a failure to converge".

## In this simulated data set, their are 25 planetary systems (with one planet each)
Ndata = 25 
## Here we asign the dispersion of the simulated population to be 0.3, this is 
## the truth we wish to recover 
sigmae = 0.3 
## We approximate the uncertainty for each measurement as normally distributed about a 
## reported measurement point estimate.  
## For the eccentricity distribution for hot Jupiters, the physical models used to derive 
## these produce larger uncertainty in k by a factor of 2. 
sigmahobs = 0.04
sigmakobs = 0.08

h = np.repeat(0.,Ndata)
k = np.repeat(0.,Ndata)
hhat = np.repeat(0.,Ndata)
khat = np.repeat(0.,Ndata)
hhat_sigma  = np.repeat(sigmahobs,Ndata)
khat_sigma  = np.repeat(sigmakobs,Ndata)

#print(khat_sigma)


for i in range(0,Ndata):
    h[i] = rnorm_bound(1,0,sigmae,lower_bound=-1,upper_bound=1)
    lb = -np.sqrt(1-h[i]**2)
    ub = np.sqrt(1-h[i]**2)
    k[i] = rnorm_bound(1,0,sigmae,lower_bound=lb,upper_bound=ub) 
    hhat[i] = rnorm_bound(1,h[i],sigmahobs,lower_bound=-1,upper_bound=1)
    khat[i] = rnorm_bound(1,k[i],sigmakobs,lower_bound=lb,upper_bound=ub)

## Vizualize the true data values, and the simulated measurements:     
print(h, hhat, k, khat)
plt.hist(h)
plt.hist(hhat)


[-0.41393112  0.11627835  0.14173597 -0.23562997 -0.42923673  0.31493326
 -0.44727581  0.07867351  0.18142182  0.05491008 -0.23725838 -0.34968788
  0.63154433 -0.21773651 -0.26725093 -0.10486356  0.26706864 -0.3596925
  0.52535476  0.02782112  0.09455172  0.15646653 -0.45867488 -0.14759476
 -0.23519005] [-0.39282589  0.10398277  0.185119   -0.20702566 -0.43025947  0.2747293
 -0.46360995  0.11549319  0.10651453 -0.01975278 -0.22597231 -0.32721677
  0.66174235 -0.22432501 -0.28507912 -0.10197261  0.26370011 -0.31493765
  0.53479582 -0.00652395  0.03023311  0.20159958 -0.50495863 -0.13666951
 -0.22402531] [ 0.06423514 -0.07986823  0.03107364  0.31301882  0.07586652  0.26847098
  0.15326399  0.06043858 -0.20819372  0.58605335  0.37366381 -0.16299988
 -0.04839924  0.04344782 -0.25254797 -0.32124179  0.03613562  0.24262385
  0.43616172  0.7381449  -0.30428953  0.22602305 -0.06654293  0.0154422
  0.02315476] [ 0.07775903 -0.09135196 -0.03922595  0.33111341  0.20831491  0.27086349
  0.12367224  0.18254924 -0.2939123   0.58856183  0.53354534 -0.26163841
 -0.15992052  0.1236724  -0.29580968 -0.2823584   0.07963913  0.29617906
  0.41995059  0.74046747 -0.14728199  0.39060533 -0.00141368 -0.01302922
  0.05184046]
Out[5]:
(array([ 4.,  3.,  4.,  2.,  3.,  4.,  3.,  0.,  1.,  1.]),
 array([-0.50495863, -0.38828853, -0.27161844, -0.15494834, -0.03827824,
         0.07839186,  0.19506196,  0.31173206,  0.42840216,  0.54507225,
         0.66174235]),
 <a list of 10 Patch objects>)

Below is the one-component Gaussian mixture model in JAGS copied from notebook 1, and gerneralized to also include k in addition to h:


In [6]:
## JAGS user manual: 
## http://www.uvm.edu/~bbeckage/Teaching/DataAnalysis/Manuals/manual.jags.pdf

## JAGS model code

## model code is the one-component truncated gaussian mixture HBM code from previous 
## notebook


code1 = '''

model {
        
    #Population parameters
    e_sigma ~ dunif(0.0, 1.0)
    e_phi <- 1/(e_sigma*e_sigma)
        
    for (n in 1:Ndata){
    
        #True planet properties
        h[n] ~ dnorm(0, e_phi) T(-1,1) #Can try multivariate truncated normal in future
        k[n] ~ dnorm(0, e_phi) T(-sqrt(1-h[n]*h[n]),sqrt(1-h[n]*h[n]))
            
        #Observed planet properties
        hhat[n] ~ dnorm(h[n], 1.0/(hhat_sigma[n]*hhat_sigma[n])) T(-1,1)
        khat[n] ~ dnorm(k[n], 1.0/(khat_sigma[n]*khat_sigma[n])) T(-sqrt(1-hhat[n]*hhat[n]),sqrt(1-hhat[n]*hhat[n]))
    }
        
}
'''

In [7]:
## Load additional JAGS module
pyjags.load_module('glm')
pyjags.load_module('dic')


## See blog post for origination of some of the adapted analysis tools used here and below:
## https://martynplummer.wordpress.com/2016/01/11/pyjags/

num_chains = 4
iterations = 10000


## data list include only variables in the model
model = pyjags.Model(code1, data=dict( Ndata=Ndata, hhat=hhat, khat=khat, 
                                     hhat_sigma=hhat_sigma, khat_sigma=khat_sigma), 
                     chains=num_chains, adapt=1000)

## Code to speed up compute time. This feature might not be 
## well tested in pyjags at this time. 
## threads=4, chains_per_thread=1 

## 500 warmup / burn-in iterations, not used for inference.
model.sample(500, vars=[])

## Run model for desired steps, monitoring hyperparameter variables, and latent variables
## for hierarchical Bayesian model.
## Returns a dictionary with numpy array for each monitored variable.
## Shapes of returned arrays are (... shape of variable ..., iterations, chains).
## samples = model.sample(#iterations per chain here, vars=['e_sigma', 'h', 'k'])
samples = model.sample(iterations, vars=['e_sigma', 'h', 'k'])

## Code to save, open and use pickled dictionary of samples:
## -- Pickle the data --
#with open('ecc_1_test.pkl', 'wb') as handle:
#   pickle.dump(samples, handle)
## -- Retrieve pickled data --
#with open('ecc_1_test.pkl', 'rb') as handle:
#   retrieved_results = pickle.load(handle)


adapting: iterations 4000 of 4000, elapsed 0:00:02, remaining 0:00:00
sampling: iterations 2000 of 2000, elapsed 0:00:01, remaining 0:00:00
sampling: iterations 14816 of 40000, elapsed 0:00:06, remaining 0:00:10
sampling: iterations 40000 of 40000, elapsed 0:00:15, remaining 0:00:00
sampling: iterations 40000 of 40000, elapsed 0:00:15, remaining 0:00:00

In [8]:
## Generalize code for both h and k below. 

#print(samples)
#print(samples.items())

## Print and check the shape of the resultant samples dictionary:
print(samples['e_sigma'].shape)
print(samples['e_sigma'].squeeze(0).shape)
print(samples['h'].shape)
print(samples['k'][0,:,:].shape)
print('-----')


## Update the samples dictionary so that it includes keys for the latent variables
## Also, we will use LaTeX formatting to help make legible plots ahead.  
samples_Nm1 = {}

## adjust the thin varible to only look at every 10th population element by setting it to 10
thin = 1
## Need to enter the number of hyperparameter variables here:
numHyperParams = 1
## Specify the dimension we want for our plot below, for legibility.  
dim = (Ndata/thin)*2 + numHyperParams
print(dim)

for i in np.arange(0,Ndata,thin):
    #hval = 'h'+str(i+1)
    #kval = 'k'+str(i+1)
    #print(hval)
    #print(kval)
    #samples_Nm1({hval: samples['h'][i,:,:]})
    samples_Nm1.update({'$h_{'+str(i+1)+'}$': samples['h'][i,:,:], '$k_{'+str(i+1)+'}$': samples['k'][i,:,:]})
#print(samples_2['h11'].shape)

## Add the hyperparameter marginal posterior back in:
samples_Nm1.update({'$e_{\sigma}$': samples['e_sigma'].squeeze(0)})

## Below, examine the updated and reformatted sample dictionary to include keys for 
## latent variables 
for j, i in samples_Nm1.items():
    print(j)
    print(i)
samples_Nm1['$h_{5}$'][0]


(1, 10000, 4)
(10000, 4)
(25, 10000, 4)
(10000, 4)
-----
51.0
$h_{2}$
[[ 0.10253433  0.1316362   0.12834745  0.11735116]
 [ 0.12571712  0.09899074  0.08417513  0.13720439]
 [ 0.08590829  0.06385894  0.08635107  0.11630583]
 ..., 
 [ 0.10566881  0.10799076  0.06841691  0.09455784]
 [ 0.11957904  0.05727128  0.07206843  0.13560663]
 [ 0.1911698   0.16288726  0.10558219  0.13052772]]
$k_{6}$
[[ 0.23303026  0.26924439  0.21380337  0.2906679 ]
 [ 0.18445061  0.37390697  0.10106424  0.32905733]
 [ 0.19003709  0.14877965  0.26174171  0.28849964]
 ..., 
 [ 0.36418163  0.11552718  0.1608084   0.3521669 ]
 [ 0.36735784  0.28963655  0.19474252  0.35273288]
 [ 0.36448591  0.18834827  0.25173262  0.19722346]]
$k_{17}$
[[ 0.1132385   0.0409769   0.06768613  0.059573  ]
 [ 0.08030623 -0.03360544  0.04774691  0.00832563]
 [ 0.06720722 -0.02875597  0.03532517 -0.04178938]
 ..., 
 [-0.03447416  0.02065731  0.06513859  0.05088001]
 [ 0.10912378  0.01995673  0.1407263   0.05323889]
 [ 0.16707742  0.08094156  0.06137499  0.07051458]]
$h_{3}$
[[ 0.17405641  0.26840097  0.21968182  0.15689839]
 [ 0.16864979  0.24352289  0.24953728  0.1634903 ]
 [ 0.18618217  0.2170887   0.23185155  0.17120043]
 ..., 
 [ 0.15525728  0.14424928  0.09278783  0.20568466]
 [ 0.15517517  0.15696719  0.247345    0.21765746]
 [ 0.20223496  0.18883554  0.09416615  0.22929541]]
$h_{14}$
[[-0.23918146 -0.18303817 -0.26839044 -0.23211915]
 [-0.26591371 -0.2048521  -0.23872307 -0.23539097]
 [-0.2678584  -0.2153669  -0.22872429 -0.2287024 ]
 ..., 
 [-0.22978153 -0.12754568 -0.23791424 -0.20909669]
 [-0.25397965 -0.26965767 -0.21791435 -0.21572071]
 [-0.21307438 -0.20917666 -0.24377526 -0.28601954]]
$k_{16}$
[[-0.15704729 -0.21285828 -0.2792875  -0.33568053]
 [-0.19851954 -0.24618128 -0.06813921 -0.32282945]
 [-0.33869267 -0.27341004 -0.42131133 -0.27862836]
 ..., 
 [-0.32394221 -0.33329035 -0.30832251 -0.30025402]
 [-0.33996423 -0.31562555 -0.33266492 -0.36129289]
 [-0.19289861 -0.31138025 -0.34803809 -0.20897591]]
$k_{14}$
[[ 0.05873532  0.07938517  0.13329342  0.17023252]
 [ 0.19673611  0.08659722  0.09781376  0.16487612]
 [ 0.15982498  0.11329393  0.05781898  0.03486515]
 ..., 
 [ 0.04837453  0.07153843  0.02049678  0.12696195]
 [ 0.10939869  0.05655206  0.14211853  0.10371186]
 [ 0.1393338   0.15321083  0.08802206  0.27093935]]
$h_{15}$
[[-0.248647   -0.20355106 -0.27999825 -0.30852748]
 [-0.2681983  -0.19139966 -0.27394761 -0.33332476]
 [-0.29401504 -0.24498748 -0.31360331 -0.298393  ]
 ..., 
 [-0.29564992 -0.22129287 -0.26453002 -0.32342199]
 [-0.26136311 -0.18540957 -0.25590965 -0.33882338]
 [-0.29064349 -0.21299496 -0.24550678 -0.24780936]]
$h_{17}$
[[ 0.27681624  0.22140712  0.25903889  0.25289196]
 [ 0.20006332  0.26168523  0.2614355   0.24747562]
 [ 0.2475662   0.28597181  0.31264437  0.30396663]
 ..., 
 [ 0.27246827  0.3033416   0.25081706  0.27854991]
 [ 0.26198445  0.26102619  0.2630611   0.25253845]
 [ 0.26393517  0.26819651  0.32365396  0.21159516]]
$h_{5}$
[[-0.42132457 -0.42506736 -0.41293575 -0.42349471]
 [-0.4174287  -0.48847872 -0.39760482 -0.44209047]
 [-0.43357988 -0.3844113  -0.44816125 -0.39484893]
 ..., 
 [-0.44273328 -0.42762864 -0.40107094 -0.37557353]
 [-0.45932545 -0.4168644  -0.40551372 -0.45912751]
 [-0.40324639 -0.39207214 -0.47251432 -0.44099632]]
$k_{24}$
[[-0.11247853 -0.17877214 -0.0441952  -0.10118143]
 [ 0.05770536  0.16745856  0.01351849 -0.13510763]
 [-0.02948574  0.12581539 -0.17060634 -0.06244359]
 ..., 
 [-0.02688861 -0.02926085 -0.02087981  0.08630697]
 [-0.03452451  0.0088232  -0.0307566  -0.04255182]
 [ 0.10210252 -0.07025033  0.01090178 -0.04251981]]
$h_{18}$
[[-0.20254661 -0.30012104 -0.28658767 -0.29041401]
 [-0.22291382 -0.3425444  -0.28245561 -0.33282496]
 [-0.26597152 -0.30639986 -0.31502555 -0.37550725]
 ..., 
 [-0.29887921 -0.36361744 -0.29192435 -0.31673689]
 [-0.3213286  -0.36208169 -0.36709367 -0.33915317]
 [-0.22997489 -0.31545671 -0.3628054  -0.31707667]]
$h_{19}$
[[ 0.54813896  0.44191249  0.56122825  0.5931594 ]
 [ 0.48166982  0.46923748  0.50924184  0.57443678]
 [ 0.57375841  0.55184667  0.5170258   0.49188406]
 ..., 
 [ 0.51563716  0.48197449  0.45073006  0.52756778]
 [ 0.55427003  0.54819195  0.42267995  0.52049938]
 [ 0.54373498  0.53802307  0.44334701  0.59532115]]
$k_{11}$
[[ 0.45903208  0.59959273  0.4920908   0.5542842 ]
 [ 0.49369657  0.51428909  0.51981626  0.55785115]
 [ 0.49635796  0.51434308  0.49401343  0.47294196]
 ..., 
 [ 0.54112159  0.5350111   0.49363681  0.57258073]
 [ 0.47502458  0.45124203  0.48357211  0.59213581]
 [ 0.48884865  0.5037988   0.64439386  0.5230196 ]]
$e_{\sigma}$
[[ 0.27231281  0.27460855  0.30096592  0.35569456]
 [ 0.29018814  0.27451018  0.28086688  0.28457543]
 [ 0.28140964  0.27683394  0.36078086  0.27681129]
 ..., 
 [ 0.30421881  0.30783944  0.29073423  0.27549841]
 [ 0.28466696  0.3143336   0.27629243  0.32818835]
 [ 0.3310659   0.31780605  0.29353376  0.35200946]]
$k_{22}$
[[ 0.27741901  0.48593962  0.51842925  0.34300986]
 [ 0.28356647  0.31694077  0.25073028  0.40486565]
 [ 0.42202879  0.37199805  0.23634014  0.3587491 ]
 ..., 
 [ 0.43976167  0.28024401  0.39450005  0.38517363]
 [ 0.36806015  0.33055594  0.32848628  0.38159496]
 [ 0.43171754  0.32461317  0.38180117  0.38625675]]
$k_{7}$
[[ 0.02908034  0.03057287  0.11583855  0.26341156]
 [ 0.01899485  0.07962978  0.27405487  0.24221648]
 [ 0.25864313  0.07276447  0.18672152 -0.02443264]
 ..., 
 [-0.0232129  -0.01739238  0.13015432  0.07326589]
 [-0.03031688  0.04663735  0.14109588  0.05971437]
 [ 0.26720275  0.06082459  0.09933677  0.05016329]]
$k_{9}$
[[-0.37823653 -0.3624757  -0.3163752  -0.26898394]
 [-0.29611191 -0.3650802  -0.27118442 -0.33106029]
 [-0.19173533 -0.2109869  -0.23509232 -0.2465763 ]
 ..., 
 [-0.25055497 -0.26274    -0.20366962 -0.18988258]
 [-0.21632168 -0.2657394  -0.35785727 -0.22775181]
 [-0.16668516 -0.28209553 -0.34603979 -0.20068915]]
$k_{23}$
[[-0.09556547 -0.12430127 -0.10662902 -0.06609384]
 [ 0.02290032 -0.0367791  -0.14128271  0.02967974]
 [-0.01243783 -0.03130098 -0.1195769   0.06974371]
 ..., 
 [-0.06008524  0.09631759  0.28866154  0.05030435]
 [ 0.06661401  0.07512447  0.06644846  0.00086804]
 [ 0.09535598 -0.07258785  0.05313265  0.0091958 ]]
$h_{16}$
[[-0.09604272 -0.09196164 -0.00842683 -0.18405324]
 [-0.10906376 -0.13967651 -0.08451263 -0.08973438]
 [-0.10905844 -0.13473176 -0.0901575  -0.11209318]
 ..., 
 [-0.1184773  -0.06802127 -0.07812702 -0.11705456]
 [-0.08079048 -0.04827062 -0.09671376 -0.16748767]
 [-0.07935595 -0.08317033 -0.07113498 -0.07999958]]
$h_{24}$
[[-0.01894785 -0.1826179  -0.16322675 -0.15152353]
 [-0.0430728  -0.1789312  -0.12930342 -0.08670984]
 [-0.08605289 -0.16135773 -0.13122109 -0.08768961]
 ..., 
 [-0.08462546 -0.10583369 -0.16658868 -0.13011233]
 [-0.09976659 -0.16373048 -0.16647748 -0.12624938]
 [-0.15015984 -0.1223573  -0.16394904 -0.11730165]]
$k_{4}$
[[ 0.3190131   0.32859854  0.4467142   0.28831327]
 [ 0.33747445  0.35259239  0.44531118  0.18406688]
 [ 0.3629325   0.33057555  0.3729153   0.26869825]
 ..., 
 [ 0.2260627   0.28520954  0.32572111  0.24293985]
 [ 0.37485601  0.36338725  0.37553837  0.18695655]
 [ 0.27684679  0.32057596  0.24494382  0.41940534]]
$h_{10}$
[[ 0.00618154  0.03103633  0.01100625 -0.05465401]
 [-0.05195228 -0.06335895  0.00046497 -0.02537501]
 [-0.00281784 -0.03348831  0.00181756 -0.0392969 ]
 ..., 
 [-0.02666374 -0.02685467 -0.00082448 -0.04317469]
 [ 0.01546982 -0.02600475  0.00470068  0.01722347]
 [-0.05735276 -0.02996491 -0.01998826  0.02498243]]
$h_{1}$
[[-0.43392749 -0.36042316 -0.33177867 -0.36510012]
 [-0.44389595 -0.40643484 -0.34150062 -0.43782681]
 [-0.35573093 -0.39114149 -0.27749127 -0.47353619]
 ..., 
 [-0.35565096 -0.3921794  -0.40419005 -0.36601547]
 [-0.40228442 -0.40555218 -0.38927779 -0.41991806]
 [-0.38294236 -0.3502083  -0.36334184 -0.34413431]]
$h_{11}$
[[-0.23857675 -0.23778336 -0.26966389 -0.25568684]
 [-0.20850643 -0.29166428 -0.14669803 -0.30422694]
 [-0.23513678 -0.26614123 -0.33031949 -0.22620172]
 ..., 
 [-0.26465297 -0.18159283 -0.17828281 -0.22532721]
 [-0.26404886 -0.27442244 -0.1768402  -0.22606414]
 [-0.17690054 -0.21069408 -0.21746889 -0.24720658]]
$k_{20}$
[[ 0.69714515  0.72779973  0.57637098  0.75479551]
 [ 0.64296864  0.66367547  0.686129    0.67481028]
 [ 0.71856117  0.70090792  0.70062603  0.76697055]
 ..., 
 [ 0.68643288  0.51649883  0.58602875  0.80767513]
 [ 0.66376802  0.70316483  0.77922226  0.57367436]
 [ 0.73122803  0.70145528  0.48924143  0.69453434]]
$k_{5}$
[[ 0.15361077  0.19271527  0.13627539  0.31121351]
 [ 0.28381093  0.21468367  0.19801944  0.2090233 ]
 [ 0.30376805  0.11669286  0.22688902  0.21355008]
 ..., 
 [ 0.25384582  0.14951487  0.22427626  0.22447947]
 [ 0.1566958   0.29273959  0.27140189  0.14755382]
 [ 0.26845861  0.1063165   0.12451308  0.23434287]]
$k_{15}$
[[-0.20494445 -0.10996346 -0.14769272 -0.41890064]
 [-0.22297691 -0.12802451 -0.29617569 -0.40450084]
 [-0.2534691  -0.33150148 -0.31175262 -0.4186171 ]
 ..., 
 [-0.3040666  -0.28141651 -0.32800379 -0.40989483]
 [-0.2397997  -0.30801823 -0.26719436 -0.24799601]
 [-0.3222692  -0.29561527 -0.2174066  -0.3197133 ]]
$k_{18}$
[[ 0.24957915  0.36083732  0.35995247  0.28093358]
 [ 0.24663788  0.25357347  0.41701774  0.31904429]
 [ 0.20713315  0.27773086  0.39964508  0.29919364]
 ..., 
 [ 0.35407697  0.35301754  0.39766205  0.21498613]
 [ 0.31972479  0.21034519  0.11420207  0.25276942]
 [ 0.40780487  0.49103954  0.21846591  0.24080432]]
$k_{21}$
[[-0.23455018 -0.17081832 -0.15322578 -0.1459952 ]
 [-0.20213246 -0.28694882 -0.16960363 -0.10372723]
 [-0.13910997 -0.14951148 -0.14970323 -0.0828845 ]
 ..., 
 [-0.10915601 -0.25356195 -0.03189427 -0.32110182]
 [-0.13254574 -0.24494895 -0.00318624 -0.34375124]
 [-0.09277138 -0.26275974 -0.2086154  -0.29731413]]
$h_{21}$
[[ 0.02077506  0.05088119  0.03550912  0.0502578 ]
 [ 0.02036704 -0.02971793 -0.01002009  0.03472803]
 [-0.01462974 -0.0304613  -0.01022341  0.04234507]
 ..., 
 [ 0.01464719 -0.0184058  -0.00727524  0.02534469]
 [ 0.04798855  0.00145904 -0.0072948   0.03660401]
 [ 0.04978044  0.01195652 -0.00778488 -0.04650567]]
$k_{8}$
[[ 0.17725896  0.10108816  0.16407505  0.22432064]
 [ 0.12121403  0.21377993  0.08262528  0.24642792]
 [ 0.11345266  0.10210626  0.06741877  0.07140329]
 ..., 
 [ 0.17952264  0.14317585  0.07376328  0.0641379 ]
 [ 0.12860248  0.24059126  0.26414122  0.23467943]
 [ 0.12702517  0.24544353  0.28509561  0.21070064]]
$h_{12}$
[[-0.30641027 -0.29192548 -0.31515968 -0.35156147]
 [-0.30194013 -0.25596408 -0.2925425  -0.32660741]
 [-0.3572004  -0.23692645 -0.28320056 -0.31712095]
 ..., 
 [-0.27344497 -0.38309543 -0.34748355 -0.30119191]
 [-0.35963153 -0.3645377  -0.27653595 -0.28772374]
 [-0.3730516  -0.3596404  -0.28068527 -0.3732037 ]]
$h_{20}$
[[-0.02898067  0.03033248 -0.07031906  0.01051596]
 [ 0.00691115 -0.01054425  0.02377401  0.00136168]
 [-0.03407013 -0.04216909  0.06425681  0.00947293]
 ..., 
 [ 0.00372665  0.08107162 -0.03053848  0.01131077]
 [-0.03693399  0.07126672 -0.03532737 -0.01740744]
 [-0.01264557  0.04585846 -0.08197675  0.00591404]]
$h_{22}$
[[ 0.16463755  0.23596397  0.28113333  0.2448718 ]
 [ 0.18547313  0.18704497  0.14083742  0.26012745]
 [ 0.18817346  0.20557789  0.13522003  0.18656808]
 ..., 
 [ 0.17239291  0.20234979  0.23257411  0.26512991]
 [ 0.23274707  0.21017112  0.1965782   0.20847929]
 [ 0.23226265  0.25182778  0.23054922  0.20432667]]
$k_{19}$
[[ 0.48709291  0.36137355  0.39131665  0.39760505]
 [ 0.43375969  0.40603449  0.44318344  0.39310735]
 [ 0.42242691  0.32387083  0.49932791  0.45854848]
 ..., 
 [ 0.56212454  0.24969732  0.46442468  0.20363339]
 [ 0.31157636  0.38669167  0.31754801  0.34241728]
 [ 0.37044887  0.38179605  0.41048351  0.37841166]]
$h_{13}$
[[ 0.68627732  0.67468694  0.62031481  0.75738399]
 [ 0.72884298  0.66030002  0.60117847  0.59714941]
 [ 0.61025972  0.65445916  0.71920153  0.59014612]
 ..., 
 [ 0.64375156  0.64930071  0.55278126  0.69297465]
 [ 0.62723928  0.62959789  0.57319209  0.67922017]
 [ 0.60295297  0.70062628  0.75110972  0.67703755]]
$h_{9}$
[[ 0.1706529   0.06834184  0.10566429  0.12231446]
 [ 0.07442097  0.07179273  0.15848152  0.05914294]
 [ 0.08686676  0.08320505  0.06854244  0.06644106]
 ..., 
 [ 0.09475064  0.16797799  0.14103022  0.13026247]
 [ 0.10681727  0.13945468  0.17918786  0.15431858]
 [ 0.12530821  0.09130089  0.10591271  0.06127957]]
$k_{1}$
[[ 0.00798379  0.0464023   0.16154184  0.08912485]
 [-0.05895778  0.09585937  0.20050473  0.1045586 ]
 [ 0.13547788  0.00705015 -0.0821143   0.1427024 ]
 ..., 
 [-0.01169892  0.07578258 -0.00880709  0.05345417]
 [-0.03376983  0.03220315  0.07188405 -0.01861939]
 [-0.01018108 -0.00597624  0.07483934  0.0541818 ]]
$k_{3}$
[[-0.20705415  0.01691531 -0.17296967  0.09303191]
 [-0.07683352 -0.05502635 -0.04194658  0.02642885]
 [ 0.00161309  0.04627165 -0.05949258 -0.10679546]
 ..., 
 [ 0.08718432  0.05264875 -0.05732215  0.1140721 ]
 [-0.04841919  0.097622   -0.00376658 -0.11146556]
 [-0.10677712 -0.14036003 -0.01404606 -0.11832461]]
$k_{2}$
[[-0.04291374 -0.08614395 -0.14311599 -0.11823014]
 [-0.05440967 -0.05496997 -0.06814378 -0.07784853]
 [-0.08235781 -0.15974494 -0.12836838 -0.06393865]
 ..., 
 [-0.13962733 -0.00592796 -0.10786642 -0.10992779]
 [-0.03821152 -0.22643015 -0.02661489 -0.09515018]
 [-0.21792493 -0.1232164  -0.02611845 -0.14336463]]
$h_{4}$
[[-0.19522391 -0.26996699 -0.24170288 -0.19762816]
 [-0.19618384 -0.17124256 -0.21185465 -0.16330457]
 [-0.16476949 -0.23369938 -0.1307667  -0.17393894]
 ..., 
 [-0.17393706 -0.14951175 -0.15813743 -0.27281972]
 [-0.24563443 -0.25313831 -0.22336323 -0.15590587]
 [-0.19324727 -0.29200805 -0.15000893 -0.23404444]]
$k_{12}$
[[-0.19861949 -0.34213007 -0.26904745 -0.27995604]
 [-0.23200289 -0.34061766 -0.28919576 -0.18409209]
 [-0.25421366 -0.22774688 -0.17506338 -0.2104209 ]
 ..., 
 [-0.3171142  -0.22349777 -0.21887264 -0.12506775]
 [-0.21526239 -0.16955147 -0.2732008  -0.45311794]
 [-0.34099888 -0.25820966 -0.26942459 -0.4717691 ]]
$h_{6}$
[[ 0.21965767  0.27167112  0.23783675  0.24356409]
 [ 0.26900074  0.30174549  0.22510528  0.31893123]
 [ 0.25727198  0.25981854  0.24551639  0.32510027]
 ..., 
 [ 0.30446217  0.25871941  0.15833559  0.24066773]
 [ 0.31928799  0.27169418  0.15541243  0.28298856]
 [ 0.32169683  0.26431915  0.20093323  0.288507  ]]
$k_{10}$
[[ 0.56905954  0.56452191  0.52739064  0.5680424 ]
 [ 0.43203014  0.51724307  0.54637952  0.48357463]
 [ 0.44905927  0.58066247  0.53676197  0.5294714 ]
 ..., 
 [ 0.3873457   0.61514641  0.52730451  0.48917035]
 [ 0.3611093   0.50542388  0.60799392  0.54492814]
 [ 0.52480691  0.50918268  0.53749714  0.60504648]]
$h_{7}$
[[-0.45905701 -0.44702272 -0.43484864 -0.47773912]
 [-0.42722119 -0.44109209 -0.4646183  -0.44503887]
 [-0.42843385 -0.40485266 -0.42680273 -0.44605909]
 ..., 
 [-0.50183483 -0.47001435 -0.38917403 -0.49064473]
 [-0.50471497 -0.44245652 -0.40747394 -0.41497557]
 [-0.48367365 -0.42738523 -0.45428856 -0.43620736]]
$h_{25}$
[[-0.18960185 -0.19462173 -0.19520401 -0.19090792]
 [-0.18313993 -0.29241219 -0.22343094 -0.20284277]
 [-0.20891228 -0.17307068 -0.16306298 -0.22969376]
 ..., 
 [-0.25717737 -0.19797981 -0.21032814 -0.15006394]
 [-0.21239995 -0.22330943 -0.19545229 -0.25412614]
 [-0.21782172 -0.15708998 -0.20929209 -0.22838731]]
$h_{8}$
[[ 0.11286973  0.11950177  0.16337019  0.15014143]
 [ 0.12688794  0.14178592  0.16757743  0.10383956]
 [ 0.10488546  0.16613976  0.09537226  0.13268603]
 ..., 
 [ 0.12539806  0.0712067   0.04720891  0.16350817]
 [ 0.16523929  0.07584116  0.03618663  0.1178749 ]
 [ 0.15930661  0.14043097  0.03188495  0.08749737]]
$h_{23}$
[[-0.46710647 -0.51970596 -0.49272791 -0.45206307]
 [-0.48011784 -0.47011147 -0.4898788  -0.53290223]
 [-0.5302506  -0.52320303 -0.46862129 -0.48960025]
 ..., 
 [-0.48329418 -0.52274659 -0.50646299 -0.41509498]
 [-0.50388936 -0.56019853 -0.49789776 -0.43022313]
 [-0.48217529 -0.46739109 -0.58925796 -0.42325372]]
$k_{25}$
[[ 0.12396676  0.09305769  0.03380553  0.00687376]
 [ 0.00751637  0.06384425  0.03330232 -0.00169812]
 [ 0.02447725  0.20660632  0.11385637  0.03443448]
 ..., 
 [ 0.00967125 -0.03036387 -0.01488517 -0.08881702]
 [ 0.07109357  0.03997608  0.00909975 -0.01900344]
 [ 0.04772112  0.13132127 -0.0078522   0.18836477]]
$k_{13}$
[[-0.20443693 -0.10952879 -0.16522201 -0.25748605]
 [-0.24487568 -0.0890936  -0.22032562 -0.21404832]
 [-0.20267586 -0.23315341 -0.23383229 -0.02539006]
 ..., 
 [-0.04924359 -0.21324351 -0.20760676 -0.19369812]
 [-0.03142913 -0.20265101 -0.21787207 -0.16837275]
 [-0.06933916 -0.0851134  -0.27333439 -0.135783  ]]
Out[8]:
array([-0.42132457, -0.42506736, -0.41293575, -0.42349471])

Below is code to look at summary statistics of the marginal posterior distributions (the probabilistic parameter estimates) for the hyperparameter and the latent variables (each population constituent), in this case h and k (a.k.a projected eccentricity here), of the exoplanet systems we are simulating).


In [9]:
## equal tailed 95% credible intervals, and posterior distribution means:
def summary(samples, varname, p=95):
    values = samples[varname][0]
    ci = np.percentile(values, [100-p, p])
    print('{:<6} mean = {:>5.1f}, {}% credible interval [{:>4.1f} {:>4.1f}]'.format(
      varname, np.mean(values), p, *ci))

for varname in samples_Nm1:
    summary(samples_Nm1, varname)


$h_{2}$ mean =   0.1, 95% credible interval [ 0.1  0.1]
$k_{6}$ mean =   0.3, 95% credible interval [ 0.2  0.3]
$k_{17}$ mean =   0.1, 95% credible interval [ 0.0  0.1]
$h_{3}$ mean =   0.2, 95% credible interval [ 0.2  0.3]
$h_{14}$ mean =  -0.2, 95% credible interval [-0.3 -0.2]
$k_{16}$ mean =  -0.2, 95% credible interval [-0.3 -0.2]
$k_{14}$ mean =   0.1, 95% credible interval [ 0.1  0.2]
$h_{15}$ mean =  -0.3, 95% credible interval [-0.3 -0.2]
$h_{17}$ mean =   0.3, 95% credible interval [ 0.2  0.3]
$h_{5}$ mean =  -0.4, 95% credible interval [-0.4 -0.4]
$k_{24}$ mean =  -0.1, 95% credible interval [-0.2 -0.1]
$h_{18}$ mean =  -0.3, 95% credible interval [-0.3 -0.2]
$h_{19}$ mean =   0.5, 95% credible interval [ 0.5  0.6]
$k_{11}$ mean =   0.5, 95% credible interval [ 0.5  0.6]
$e_{\sigma}$ mean =   0.3, 95% credible interval [ 0.3  0.3]
$k_{22}$ mean =   0.4, 95% credible interval [ 0.3  0.5]
$k_{7}$ mean =   0.1, 95% credible interval [ 0.0  0.2]
$k_{9}$ mean =  -0.3, 95% credible interval [-0.4 -0.3]
$k_{23}$ mean =  -0.1, 95% credible interval [-0.1 -0.1]
$h_{16}$ mean =  -0.1, 95% credible interval [-0.2 -0.0]
$h_{24}$ mean =  -0.1, 95% credible interval [-0.2 -0.0]
$k_{4}$ mean =   0.3, 95% credible interval [ 0.3  0.4]
$h_{10}$ mean =  -0.0, 95% credible interval [-0.0  0.0]
$h_{1}$ mean =  -0.4, 95% credible interval [-0.4 -0.3]
$h_{11}$ mean =  -0.3, 95% credible interval [-0.3 -0.2]
$k_{20}$ mean =   0.7, 95% credible interval [ 0.6  0.8]
$k_{5}$ mean =   0.2, 95% credible interval [ 0.1  0.3]
$k_{15}$ mean =  -0.2, 95% credible interval [-0.4 -0.1]
$k_{18}$ mean =   0.3, 95% credible interval [ 0.3  0.4]
$k_{21}$ mean =  -0.2, 95% credible interval [-0.2 -0.1]
$h_{21}$ mean =   0.0, 95% credible interval [ 0.0  0.1]
$k_{8}$ mean =   0.2, 95% credible interval [ 0.1  0.2]
$h_{12}$ mean =  -0.3, 95% credible interval [-0.3 -0.3]
$h_{20}$ mean =  -0.0, 95% credible interval [-0.1  0.0]
$h_{22}$ mean =   0.2, 95% credible interval [ 0.2  0.3]
$k_{19}$ mean =   0.4, 95% credible interval [ 0.4  0.5]
$h_{13}$ mean =   0.7, 95% credible interval [ 0.6  0.7]
$h_{9}$ mean =   0.1, 95% credible interval [ 0.1  0.2]
$k_{1}$ mean =   0.1, 95% credible interval [ 0.0  0.2]
$k_{3}$ mean =  -0.1, 95% credible interval [-0.2  0.1]
$k_{2}$ mean =  -0.1, 95% credible interval [-0.1 -0.0]
$h_{4}$ mean =  -0.2, 95% credible interval [-0.3 -0.2]
$k_{12}$ mean =  -0.3, 95% credible interval [-0.3 -0.2]
$h_{6}$ mean =   0.2, 95% credible interval [ 0.2  0.3]
$k_{10}$ mean =   0.6, 95% credible interval [ 0.5  0.6]
$h_{7}$ mean =  -0.5, 95% credible interval [-0.5 -0.4]
$h_{25}$ mean =  -0.2, 95% credible interval [-0.2 -0.2]
$h_{8}$ mean =   0.1, 95% credible interval [ 0.1  0.2]
$h_{23}$ mean =  -0.5, 95% credible interval [-0.5 -0.5]
$k_{25}$ mean =   0.1, 95% credible interval [ 0.0  0.1]
$k_{13}$ mean =  -0.2, 95% credible interval [-0.2 -0.1]

In [10]:
## Use pandas three dimensional Panel to represent the trace:

trace = pd.Panel({k: v for k, v in samples_Nm1.items()})
trace.axes[0].name = 'Variable'
trace.axes[1].name = 'Iteration'
trace.axes[2].name = 'Chain'
 
## Point estimates:
print(trace.to_frame().mean())
 
## Bayesian equal-tailed 95% credible intervals:
print(trace.to_frame().quantile([0.05, 0.95]))
  ## ^ entering the values here could be a good question part
    
def plot(trace, var):
    fig, axes = plt.subplots(1, 3, figsize=(9, 3))
    fig.suptitle(var, y=0.95, fontsize='xx-large')
 
    ## Marginal posterior density estimate:
    trace[var].plot.density(ax=axes[0])
    axes[0].set_xlabel('Parameter value')
    axes[0].locator_params(tight=True)
 
    ## Autocorrelation for each chain:
    axes[1].set_xlim(0, 100)
    for chain in trace[var].columns:
        autocorrelation_plot(trace[var,:,chain], axes[1], label=chain)
 
    ## Trace plot:
    axes[2].set_ylabel('Parameter value')
    trace[var].plot(ax=axes[2])
 
    ## Save figure
    filename = var.replace("\\", "") 
    filename = filename.replace("$", "") 
    filename = filename.replace("}", "") 
    filename = filename.replace("{", "") 
    plt.tight_layout(pad=3)
    fig.savefig('Nm1_JAGS_h_and_k_'+'{}.png'.format(filename))
 
# Display diagnostic plots
for var in trace:
    plot(trace, var)


Variable
$e_{\sigma}$    0.309311
$h_{10}$       -0.019250
$h_{11}$       -0.222060
$h_{12}$       -0.321545
$h_{13}$        0.650822
$h_{14}$       -0.220601
$h_{15}$       -0.280473
$h_{16}$       -0.100235
$h_{17}$        0.259326
$h_{18}$       -0.309588
$h_{19}$        0.525893
$h_{1}$        -0.385865
$h_{20}$       -0.006464
$h_{21}$        0.029371
$h_{22}$        0.198489
$h_{23}$       -0.496816
$h_{24}$       -0.134250
$h_{25}$       -0.220330
$h_{2}$         0.102328
$h_{3}$         0.182095
$h_{4}$        -0.203770
$h_{5}$        -0.423044
$h_{6}$         0.270204
$h_{7}$        -0.456033
$h_{8}$         0.113666
$h_{9}$         0.104413
$k_{10}$        0.551398
$k_{11}$        0.499614
$k_{12}$       -0.245080
$k_{13}$       -0.149712
$k_{14}$        0.115249
$k_{15}$       -0.277455
$k_{16}$       -0.264313
$k_{17}$        0.074255
$k_{18}$        0.277206
$k_{19}$        0.392612
$k_{1}$         0.073636
$k_{20}$        0.692894
$k_{21}$       -0.137406
$k_{22}$        0.365926
$k_{23}$       -0.001388
$k_{24}$       -0.011519
$k_{25}$        0.048367
$k_{2}$        -0.085199
$k_{3}$        -0.036414
$k_{4}$         0.309809
$k_{5}$         0.194818
$k_{6}$         0.252900
$k_{7}$         0.115387
$k_{8}$         0.170109
$k_{9}$        -0.274807
dtype: float64
Variable  $e_{\sigma}$  $h_{10}$  $h_{11}$  $h_{12}$  $h_{13}$  $h_{14}$  \
0.05          0.257223 -0.084444 -0.287166 -0.386818  0.585504  -0.28578   
0.95          0.374496  0.045641 -0.156304 -0.256285  0.716314  -0.15494   

Variable  $h_{15}$  $h_{16}$  $h_{17}$  $h_{18}$    ...     $k_{24}$  \
0.05     -0.346281 -0.166220  0.193598 -0.374563    ...    -0.138929   
0.95     -0.214384 -0.034864  0.325130 -0.244318    ...     0.114949   

Variable  $k_{25}$   $k_{2}$   $k_{3}$   $k_{4}$   $k_{5}$   $k_{6}$  \
0.05     -0.078134 -0.213116 -0.163465  0.181404  0.067388  0.125537   
0.95      0.174514  0.043156  0.091719  0.436556  0.320919  0.380345   

Variable   $k_{7}$   $k_{8}$   $k_{9}$  
0.05     -0.011534  0.043072 -0.402388  
0.95      0.241318  0.298412 -0.146360  

[2 rows x 51 columns]
//anaconda/lib/python2.7/site-packages/matplotlib/pyplot.py:516: RuntimeWarning: More than 20 figures have been opened. Figures created through the pyplot interface (`matplotlib.pyplot.figure`) are retained until explicitly closed and may consume too much memory. (To control this warning, see the rcParam `figure.max_open_warning`).
  max_open_warning, RuntimeWarning)

In [11]:
## Scatter matrix plot:

## Redefine the trace so that we only vizualize every 10th latent variable element in 
## the scatter_matrix plot below. Vizualizing all 50 is too cumbersome for the scatter
## matrix. 

samples_Nm1_for_scatter_matrix = {}
Nsamp=25
## adjust the thin varible to only look at every 10th population element by setting it to 10
thin = 10
numHyperParams = 1
dim = (Nsamp/thin)*2 + numHyperParams
print(dim)

for i in np.arange(0,Nsamp,thin):
    samples_Nm1_for_scatter_matrix.update({'$h_{'+str(i+1)+'}$': samples['h'][i,:,:], '$k_{'+str(i+1)+'}$': samples['k'][i,:,:]})
samples_Nm1_for_scatter_matrix.update({'$e_{\sigma}$': samples['e_sigma'].squeeze(0)})

for j, i in samples_Nm1_for_scatter_matrix.items():
    print(j)
#    print(i)

trace_2 = pd.Panel({k: v for k, v in samples_Nm1_for_scatter_matrix.items()})

sm = scatter_matrix(trace_2.to_frame(),  color="darkturquoise", alpha=0.2, figsize=(dim*2, dim*2), diagonal='hist',hist_kwds={'bins':25,'histtype':'step', 'edgecolor':'r','linewidth':2})
## y labels size
[plt.setp(item.yaxis.get_label(), 'size', 20) for item in sm.ravel()]
## x labels size 
[plt.setp(item.xaxis.get_label(), 'size', 20) for item in sm.ravel()]
## Change label rotation
## This is helpful for very long labels
#[s.xaxis.label.set_rotation(45) for s in sm.reshape(-1)]
[s.xaxis.label.set_rotation(0) for s in sm.reshape(-1)]
[s.yaxis.label.set_rotation(0) for s in sm.reshape(-1)]
## May need to offset label when rotating to prevent overlap of figure
[s.get_yaxis().set_label_coords(-0.5,0.5) for s in sm.reshape(-1)]
## Hide all ticks
#[s.set_xticks(()) for s in sm.reshape(-1)]
#[s.set_yticks(()) for s in sm.reshape(-1)]

plt.savefig('scatter_matrix_Nm1_h_and_k_JAGS.png')


6.0
$k_{21}$
$h_{21}$
$k_{1}$
$k_{11}$
$e_{\sigma}$
$h_{1}$
$h_{11}$

2. Generalize to a two-component Gaussian Mixture Population Model


In [12]:
## Below we designate the population values of our two-component truncated Gaussian 
## generative model. These are the truths that we should recover if our hiararchical 
## Bayesian model is properly specified and diagnostics have indicated that the simulation 
## has "not not converged".
Ndata = 25
Nm = 2
frac = [0.7,0.3]
sigmae = [0.05,0.3]

## After generating values from the population model, we now add realistic Gaussian noise 
## to create simulated measurements. 
sigmahobs = 0.04
sigmakobs = 0.08

h = np.repeat(0.,Ndata)
k = np.repeat(0.,Ndata)
hhat = np.repeat(0.,Ndata)
khat = np.repeat(0.,Ndata)
hhat_sigma  = np.repeat(sigmahobs,Ndata)
khat_sigma  = np.repeat(sigmakobs,Ndata)

#print(khat_sigma)

for i in range(0,Ndata):
    #print('i')
    #print(i)
    
    c = np.random.choice(len(frac), 1, p=frac, replace=True)
    #print(int(c))
    h[i] = rnorm_bound(1,0,sigmae[int(c)],lower_bound=-1,upper_bound=1)
    # Euler's formula: h^2 + k^2 = 1
    lb = -np.sqrt(1-h[i]**2)
    ub = np.sqrt(1-h[i]**2)
    k[i] = rnorm_bound(1,0,sigmae[int(c)],lower_bound=lb,upper_bound=ub) 
    hhat[i] = rnorm_bound(1,h[i],sigmahobs,lower_bound=-1,upper_bound=1)
    khat[i] = rnorm_bound(1,k[i],sigmakobs,lower_bound=lb,upper_bound=ub)

    
print(h, hhat, k, khat)


[  1.90728522e-01   4.16351416e-01  -4.51553983e-02   5.61052483e-02
   2.63332074e-01   3.45277147e-02  -1.77512105e-01  -4.27587281e-02
   5.04500406e-01   6.14193203e-02   6.71405485e-02  -5.22859539e-01
  -1.07109599e-02  -5.95944018e-02   8.75761340e-05   1.01034911e-01
   3.20705629e-02  -6.23424973e-02  -3.81486190e-02  -2.65185925e-02
   4.25208642e-01  -4.48275367e-02   9.30256752e-02   8.85250264e-03
  -1.94923663e-04  -7.90552908e-01  -4.58059711e-01  -2.13661150e-02
  -7.85621709e-02   1.33864735e-02  -4.19399240e-01   1.85757101e-02
   4.53592639e-01  -1.13448907e-01  -4.73125057e-02   5.68917277e-02
  -9.31369756e-02   8.16995605e-02   1.73420906e-02  -6.34705892e-02
  -6.82137670e-01   8.49280384e-02  -4.77055277e-02  -4.11854297e-02
   4.27478367e-01  -6.10022550e-02   4.82659004e-02  -1.59836982e-01
  -3.74781480e-02  -3.10298944e-02] [  2.14625050e-01   3.27705311e-01  -5.31186294e-02   7.85126872e-02
   2.55690422e-01   2.21872003e-02  -1.33668075e-01  -5.88149759e-02
   4.18007357e-01   3.01289425e-02   8.88827364e-02  -5.34054697e-01
  -2.80109979e-02  -5.23712421e-02  -9.68520832e-03   1.16860374e-01
   2.68027140e-03  -7.79407612e-02  -9.66522068e-03   2.06462926e-02
   4.62577720e-01  -4.20619894e-02   8.01583185e-02   4.94844424e-02
  -3.24934299e-04  -8.35225381e-01  -4.46697018e-01  -3.59097462e-02
  -2.79091570e-02  -5.12425385e-03  -4.89867514e-01   8.07571890e-03
   4.25862699e-01  -3.23649862e-02  -6.21307299e-02   2.00495861e-02
  -1.34959106e-01   1.05652218e-01   1.56694593e-02  -2.79840550e-02
  -7.12367189e-01   5.17704521e-02  -2.56740328e-02  -5.38058620e-02
   4.32750284e-01  -8.86916148e-02   1.01115673e-01  -1.84102934e-01
  -5.24947839e-02  -1.54634273e-02] [  2.81220053e-01  -2.73201238e-01   2.92726249e-02  -1.86039880e-02
  -1.61704954e-01  -3.58097477e-02  -3.24322699e-01  -5.54358874e-04
   1.04102917e-01   4.24978925e-02   3.16115615e-02   3.67462475e-01
  -1.86103177e-02  -1.36839738e-02   5.85913637e-01   3.09497519e-01
   4.87641077e-02   1.52274516e-02   9.11976024e-02  -6.58680522e-02
   6.78091773e-02   5.94469957e-03   2.57244029e-02  -5.76038030e-02
   4.74908955e-02   3.62652153e-01   2.09367245e-02   6.62419228e-02
   3.99819674e-02  -7.36931804e-02   6.34293514e-02  -1.17652660e-02
  -2.04110488e-01   1.34485403e-01  -3.72288217e-02  -2.08382883e-02
  -5.75957932e-02  -5.74836261e-02  -7.49107359e-02   6.59689809e-02
   5.28989514e-01  -5.42946899e-03   6.61218442e-02   4.13660344e-02
  -2.39065946e-01  -6.22125858e-02  -1.08265974e-01   2.64032023e-02
  -2.97388814e-02   1.74774021e-02] [ 0.35474429 -0.29393934  0.05357464 -0.09689841 -0.2643456  -0.02183307
 -0.31702687 -0.0215985   0.08561801 -0.02199545  0.04663495  0.47282415
 -0.0800195   0.0912828   0.71713963  0.33039424  0.0571146   0.06652926
  0.01060671 -0.13062964  0.02084277  0.16564136 -0.13069599  0.11036314
  0.080813    0.17364551  0.02622056 -0.02873852  0.01116926  0.03948277
 -0.07201533 -0.02048223 -0.08322821  0.16077942  0.07032995  0.12174131
 -0.19412389  0.01279173 -0.04461026  0.05834899  0.4843436   0.02553563
  0.09099832  0.09863167 -0.1732664  -0.08883511 -0.10825843  0.08867988
 -0.18067078  0.01662737]

Below is new JAGS model code for a two-component truncated Gaussian Mixture model:


In [13]:
# JAGS model code

code = '''

model {
        
    #Population parameters
    for (j in 1:Nm) {
        e_sigma[j] ~ dunif(0.0, 1.0)
        e_phi[j] <- 1/(e_sigma[j]*e_sigma[j])
        a[j] <- 1;
    }

    f ~ ddirch(a[])
        
    for (n in 1:Ndata){
    
        #True planet properties
        c[n] ~ dcat(f[]) 
        h[n] ~ dnorm(0, e_phi[c[n]]) T(-1,1) #Can try multivariate truncated normal in future
        k[n] ~ dnorm(0, e_phi[c[n]]) T(-sqrt(1-h[n]*h[n]),sqrt(1-h[n]*h[n]))
            
        #Observed planet properties
        hhat[n] ~ dnorm(h[n], 1.0/(hhat_sigma[n]*hhat_sigma[n])) T(-1,1)
        khat[n] ~ dnorm(k[n], 1.0/(khat_sigma[n]*khat_sigma[n])) T(-sqrt(1-hhat[n]*hhat[n]),sqrt(1-hhat[n]*hhat[n]))
    }
        
}
'''

Below, the number of iterations has been increased by a factor of 10. This may slow down your laptop computer even more if it hasn't already crashed. It depends on your laptop specs. The number if iterations may need to me increased even more to reach signs of having "not not" converged.


In [14]:
# Load additional JAGS module
pyjags.load_module('glm')
pyjags.load_module('dic')

#data list include only variables in the model
model = pyjags.Model(code, data=dict(Nm=Nm, Ndata=Ndata, hhat=hhat, khat=khat, 
                                     hhat_sigma=hhat_sigma, khat_sigma=khat_sigma), 
                     chains=4, adapt=1000)
# threads=4, chains_per_thread=1 
# 500 warmup / burn-in iterations, not used for inference.
model.sample(500, vars=[])

## Run model for desired steps, monitoring hyperparameter variables.
## Returns a dictionary with numpy array for each monitored variable.
## Shapes of returned arrays are (... shape of variable ..., iterations, chains).
## 
iters = 100000
samples2 = model.sample(iters, vars=['e_sigma', 'h', 'k', 'c', 'f'])

# Pickle the data
#with open('ecc_1_test.pkl', 'wb') as handle:
#    pickle.dump(samples, handle)
    
# Retrieve pickled data
# with open('ecc_1_test.pkl', 'rb') as handle:
#      retrieved_results = pickle.load(handle)


adapting: iterations 4000 of 4000, elapsed 0:00:04, remaining 0:00:00
sampling: iterations 2000 of 2000, elapsed 0:00:02, remaining 0:00:00
sampling: iterations 9216 of 400000, elapsed 0:00:10, remaining 0:07:04
sampling: iterations 13828 of 400000, elapsed 0:00:16, remaining 0:07:22
sampling: iterations 18192 of 400000, elapsed 0:00:22, remaining 0:07:39
sampling: iterations 22352 of 400000, elapsed 0:00:27, remaining 0:07:39
sampling: iterations 30592 of 400000, elapsed 0:00:37, remaining 0:07:32
sampling: iterations 34672 of 400000, elapsed 0:00:45, remaining 0:07:54
sampling: iterations 38520 of 400000, elapsed 0:00:52, remaining 0:08:08
sampling: iterations 42224 of 400000, elapsed 0:00:57, remaining 0:08:05
sampling: iterations 49616 of 400000, elapsed 0:01:07, remaining 0:07:52
sampling: iterations 57044 of 400000, elapsed 0:01:16, remaining 0:07:38
sampling: iterations 64556 of 400000, elapsed 0:01:26, remaining 0:07:26
sampling: iterations 72084 of 400000, elapsed 0:01:35, remaining 0:07:11
sampling: iterations 79696 of 400000, elapsed 0:01:45, remaining 0:07:02
sampling: iterations 83488 of 400000, elapsed 0:01:51, remaining 0:06:59
sampling: iterations 91052 of 400000, elapsed 0:02:00, remaining 0:06:48
sampling: iterations 98652 of 400000, elapsed 0:02:08, remaining 0:06:32
sampling: iterations 106364 of 400000, elapsed 0:02:16, remaining 0:06:16
sampling: iterations 114176 of 400000, elapsed 0:02:25, remaining 0:06:03
sampling: iterations 122056 of 400000, elapsed 0:02:34, remaining 0:05:51
sampling: iterations 129996 of 400000, elapsed 0:02:43, remaining 0:05:38
sampling: iterations 137992 of 400000, elapsed 0:02:52, remaining 0:05:26
sampling: iterations 146044 of 400000, elapsed 0:03:01, remaining 0:05:15
sampling: iterations 154116 of 400000, elapsed 0:03:10, remaining 0:05:03
sampling: iterations 162248 of 400000, elapsed 0:03:18, remaining 0:04:51
sampling: iterations 170436 of 400000, elapsed 0:03:27, remaining 0:04:39
sampling: iterations 178668 of 400000, elapsed 0:03:36, remaining 0:04:28
sampling: iterations 186944 of 400000, elapsed 0:03:45, remaining 0:04:16
sampling: iterations 195264 of 400000, elapsed 0:03:54, remaining 0:04:05
sampling: iterations 203612 of 400000, elapsed 0:04:03, remaining 0:03:55
sampling: iterations 211988 of 400000, elapsed 0:04:12, remaining 0:03:44
sampling: iterations 220404 of 400000, elapsed 0:04:21, remaining 0:03:33
sampling: iterations 228852 of 400000, elapsed 0:04:30, remaining 0:03:22
sampling: iterations 237324 of 400000, elapsed 0:04:39, remaining 0:03:11
sampling: iterations 245824 of 400000, elapsed 0:04:49, remaining 0:03:01
sampling: iterations 254348 of 400000, elapsed 0:04:58, remaining 0:02:51
sampling: iterations 262892 of 400000, elapsed 0:05:07, remaining 0:02:40
sampling: iterations 271456 of 400000, elapsed 0:05:16, remaining 0:02:30
sampling: iterations 280036 of 400000, elapsed 0:05:26, remaining 0:02:19
sampling: iterations 288640 of 400000, elapsed 0:05:35, remaining 0:02:09
sampling: iterations 297264 of 400000, elapsed 0:05:44, remaining 0:01:59
sampling: iterations 305900 of 400000, elapsed 0:05:54, remaining 0:01:49
sampling: iterations 314544 of 400000, elapsed 0:06:03, remaining 0:01:39
sampling: iterations 323204 of 400000, elapsed 0:06:13, remaining 0:01:29
sampling: iterations 331880 of 400000, elapsed 0:06:22, remaining 0:01:18
sampling: iterations 340572 of 400000, elapsed 0:06:31, remaining 0:01:08
sampling: iterations 349276 of 400000, elapsed 0:06:41, remaining 0:00:58
sampling: iterations 357992 of 400000, elapsed 0:06:50, remaining 0:00:48
sampling: iterations 366716 of 400000, elapsed 0:07:00, remaining 0:00:38
sampling: iterations 375456 of 400000, elapsed 0:07:09, remaining 0:00:28
sampling: iterations 384208 of 400000, elapsed 0:07:18, remaining 0:00:18
sampling: iterations 392972 of 400000, elapsed 0:07:28, remaining 0:00:08
sampling: iterations 400000 of 400000, elapsed 0:07:35, remaining 0:00:00
sampling: iterations 400000 of 400000, elapsed 0:07:35, remaining 0:00:00

Sort the high and low mixture components to accomidate the exchangeability of parameters. (Future work: reprocess the sorted variables to obtain Gelman-Rubin statistics).


In [15]:
chain_thin = 100
start = int(iters-1000)
esigma_low = np.where(samples2['e_sigma'][0,start::,:] <= samples2['e_sigma'][1,start::,:], samples2['e_sigma'][0,start::,:], samples2['e_sigma'][1,start::,:])
esigma_hi = np.where(samples2['e_sigma'][0,start::,:] > samples2['e_sigma'][1,start::,:], samples2['e_sigma'][0,start::,:], samples2['e_sigma'][1,start::,:])
f_low = np.where(samples2['e_sigma'][0,start::,:] <= samples2['e_sigma'][1,start::,:], samples2['f'][0,start::,:], samples2['f'][1,start::,:])
f_hi = np.where(samples2['e_sigma'][0,start::,:] > samples2['e_sigma'][1,start::,:], samples2['f'][0,start::,:], samples2['f'][1,start::,:])
print(np.min(f_hi))
plt.hist(f_low)


0.115304477818
Out[15]:
([array([   8.,   16.,   65.,  160.,  230.,  268.,  169.,   72.,    9.,    3.]),
  array([   3.,   15.,   58.,  154.,  274.,  271.,  162.,   56.,    5.,    2.]),
  array([   2.,   22.,   60.,  166.,  221.,  265.,  179.,   71.,   13.,    1.]),
  array([   7.,   13.,   58.,  144.,  233.,  272.,  190.,   68.,   13.,    2.])],
 array([ 0.34798515,  0.40165619,  0.45532722,  0.50899826,  0.5626693 ,
         0.61634034,  0.67001137,  0.72368241,  0.77735345,  0.83102449,
         0.88469552]),
 <a list of 4 Lists of Patches objects>)

In [18]:
iters = 100000

#print(samples)
#print(samples.items())

print(samples2['h'].shape)
print(samples2['k'][0,start::,:].shape)
print('-----')
samples_Nm2 = {}
Nsamp=25
thin = 1
numHyperParams = 4
dim = (Nsamp/thin)*2 + numHyperParams
print(dim)
for i in np.arange(0,Nsamp,thin):
    samples_Nm2.update({'$h_{'+str(i+1)+'}$': samples2['h'][i,start::,:], '$k_{'+str(i+1)+'}$': samples2['k'][i,start::,:]})
samples_Nm2.update({'$e_{\sigma_{low}}$': esigma_low, '$e_{\sigma_{high}}$': esigma_hi })
samples_Nm2.update({'$f_{low}$': f_low,'$f_{high}$': f_hi })

for j, i in samples_Nm2.items():
    print(j)
    #print(i)


(50, 100000, 4)
(1000, 4)
-----
54.0
$h_{2}$
$k_{6}$
$k_{17}$
$h_{3}$
$h_{14}$
$k_{16}$
$k_{14}$
$h_{15}$
$h_{17}$
$h_{5}$
$k_{24}$
$f_{low}$
$h_{18}$
$h_{19}$
$k_{11}$
$k_{22}$
$k_{7}$
$k_{9}$
$k_{23}$
$h_{16}$
$h_{24}$
$k_{4}$
$h_{10}$
$h_{1}$
$h_{11}$
$k_{20}$
$f_{high}$
$k_{5}$
$k_{15}$
$k_{18}$
$k_{21}$
$h_{21}$
$k_{8}$
$h_{12}$
$e_{\sigma_{low}}$
$h_{20}$
$h_{22}$
$k_{19}$
$h_{13}$
$h_{9}$
$k_{1}$
$k_{3}$
$k_{2}$
$h_{4}$
$k_{12}$
$h_{6}$
$k_{10}$
$h_{7}$
$e_{\sigma_{high}}$
$h_{25}$
$h_{8}$
$h_{23}$
$k_{25}$
$k_{13}$

In [19]:
## equal tailed 95% credible intervals, and posterior distribution means:
def summary(samples, varname, p=95):
    values = samples[varname][0]
    ci = np.percentile(values, [100-p, p])
    print('{:<6} mean = {:>5.1f}, {}% credible interval [{:>4.1f} {:>4.1f}]'.format(
      varname, np.mean(values), p, *ci))

for varname in samples_Nm2:
    summary(samples_Nm2, varname)


$h_{2}$ mean =   0.3, 95% credible interval [ 0.3  0.4]
$k_{6}$ mean =  -0.0, 95% credible interval [-0.1  0.0]
$k_{17}$ mean =   0.1, 95% credible interval [ 0.0  0.1]
$h_{3}$ mean =  -0.0, 95% credible interval [-0.0 -0.0]
$h_{14}$ mean =  -0.0, 95% credible interval [-0.1 -0.0]
$k_{16}$ mean =   0.3, 95% credible interval [ 0.3  0.4]
$k_{14}$ mean =   0.0, 95% credible interval [ 0.0  0.1]
$h_{15}$ mean =   0.0, 95% credible interval [-0.1  0.1]
$h_{17}$ mean =   0.0, 95% credible interval [-0.0  0.0]
$h_{5}$ mean =   0.3, 95% credible interval [ 0.2  0.3]
$k_{24}$ mean =   0.0, 95% credible interval [ 0.0  0.1]
$f_{low}$ mean =   0.6, 95% credible interval [ 0.5  0.6]
$h_{18}$ mean =  -0.1, 95% credible interval [-0.1 -0.0]
$h_{19}$ mean =  -0.0, 95% credible interval [-0.0  0.0]
$k_{11}$ mean =  -0.0, 95% credible interval [-0.1  0.0]
$k_{22}$ mean =   0.1, 95% credible interval [ 0.0  0.1]
$k_{7}$ mean =  -0.3, 95% credible interval [-0.4 -0.2]
$k_{9}$ mean =   0.2, 95% credible interval [ 0.0  0.3]
$k_{23}$ mean =   0.0, 95% credible interval [-0.0  0.0]
$h_{16}$ mean =   0.1, 95% credible interval [ 0.1  0.2]
$h_{24}$ mean =   0.0, 95% credible interval [-0.0  0.1]
$k_{4}$ mean =  -0.0, 95% credible interval [-0.1 -0.0]
$h_{10}$ mean =   0.0, 95% credible interval [ 0.0  0.0]
$h_{1}$ mean =   0.2, 95% credible interval [ 0.2  0.3]
$h_{11}$ mean =   0.1, 95% credible interval [ 0.0  0.1]
$k_{20}$ mean =  -0.0, 95% credible interval [-0.0  0.0]
$f_{high}$ mean =   0.4, 95% credible interval [ 0.4  0.5]
$k_{5}$ mean =  -0.3, 95% credible interval [-0.4 -0.2]
$k_{15}$ mean =   0.7, 95% credible interval [ 0.7  0.7]
$k_{18}$ mean =   0.0, 95% credible interval [-0.0  0.1]
$k_{21}$ mean =  -0.0, 95% credible interval [-0.1  0.1]
$h_{21}$ mean =   0.5, 95% credible interval [ 0.4  0.5]
$k_{8}$ mean =   0.0, 95% credible interval [-0.0  0.1]
$h_{12}$ mean =  -0.5, 95% credible interval [-0.6 -0.5]
$e_{\sigma_{low}}$ mean =   0.0, 95% credible interval [ 0.0  0.0]
$h_{20}$ mean =   0.0, 95% credible interval [-0.0  0.1]
$h_{22}$ mean =  -0.0, 95% credible interval [-0.1  0.0]
$k_{19}$ mean =   0.0, 95% credible interval [-0.0  0.0]
$h_{13}$ mean =   0.0, 95% credible interval [-0.0  0.0]
$h_{9}$ mean =   0.4, 95% credible interval [ 0.4  0.4]
$k_{1}$ mean =   0.3, 95% credible interval [ 0.2  0.5]
$k_{3}$ mean =   0.0, 95% credible interval [ 0.0  0.0]
$k_{2}$ mean =  -0.2, 95% credible interval [-0.3 -0.2]
$h_{4}$ mean =   0.0, 95% credible interval [-0.0  0.1]
$k_{12}$ mean =   0.5, 95% credible interval [ 0.4  0.6]
$h_{6}$ mean =  -0.0, 95% credible interval [-0.0  0.0]
$k_{10}$ mean =  -0.0, 95% credible interval [-0.1  0.0]
$h_{7}$ mean =  -0.2, 95% credible interval [-0.2 -0.2]
$e_{\sigma_{high}}$ mean =   0.5, 95% credible interval [ 0.4  0.6]
$h_{25}$ mean =  -0.0, 95% credible interval [-0.1  0.0]
$h_{8}$ mean =  -0.0, 95% credible interval [-0.1  0.0]
$h_{23}$ mean =   0.1, 95% credible interval [ 0.0  0.1]
$k_{25}$ mean =   0.0, 95% credible interval [-0.0  0.1]
$k_{13}$ mean =   0.0, 95% credible interval [-0.1  0.1]

In [20]:
## Use pandas three dimensional Panel to represent the trace:

trace = pd.Panel({k: v for k, v in samples_Nm2.items()})
trace.axes[0].name = 'Variable'
trace.axes[1].name = 'Iteration'
trace.axes[2].name = 'Chain'
 
## Point estimates:
print(trace.to_frame().mean())
 
## Bayesian equal-tailed 95% credible intervals:
print(trace.to_frame().quantile([0.05, 0.95]))
  ## ^ entering the values here could be a good question part
    
def plot(trace, var):
    fig, axes = plt.subplots(1, 3, figsize=(9, 3))
    fig.suptitle(var, y=0.95, fontsize='xx-large')
 
    ## Marginal posterior density estimate:
    trace[var].plot.density(ax=axes[0])
    axes[0].set_xlabel('Parameter value')
    axes[0].locator_params(tight=True)
 
    ## Autocorrelation for each chain:
    axes[1].set_xlim(0, 100)
    for chain in trace[var].columns:
        autocorrelation_plot(trace[var,:,chain], axes[1], label=chain)
 
    ## Trace plot:
    axes[2].set_ylabel('Parameter value')
    trace[var].plot(ax=axes[2])
 
    ## Save figure
    filename = var.replace("\\", "") 
    filename = filename.replace("/", "") 
    filename = filename.replace("$", "") 
    filename = filename.replace("}", "") 
    filename = filename.replace("{", "") 
    plt.tight_layout(pad=3)
    fig.savefig('Nm2_h_and_k_JAGS_'+'{}.png'.format(filename))

    
## Display diagnostic plots
for var in trace:
    plot(trace, var)


Variable
$e_{\sigma_{high}}$    0.394055
$e_{\sigma_{low}}$     0.034870
$f_{high}$             0.382286
$f_{low}$              0.617714
$h_{10}$               0.012632
$h_{11}$               0.040999
$h_{12}$              -0.526457
$h_{13}$              -0.011911
$h_{14}$              -0.023212
$h_{15}$              -0.008286
$h_{16}$               0.114204
$h_{17}$               0.000881
$h_{18}$              -0.036525
$h_{19}$              -0.003920
$h_{1}$                0.212685
$h_{20}$               0.009672
$h_{21}$               0.456744
$h_{22}$              -0.021009
$h_{23}$               0.041090
$h_{24}$               0.022594
$h_{25}$              -0.000515
$h_{2}$                0.323615
$h_{3}$               -0.023640
$h_{4}$                0.037087
$h_{5}$                0.252743
$h_{6}$                0.009221
$h_{7}$               -0.132533
$h_{8}$               -0.024786
$h_{9}$                0.414369
$k_{10}$              -0.002743
$k_{11}$               0.011031
$k_{12}$               0.454464
$k_{13}$              -0.015276
$k_{14}$               0.019837
$k_{15}$               0.686415
$k_{16}$               0.313135
$k_{17}$               0.012194
$k_{18}$               0.016499
$k_{19}$               0.003093
$k_{1}$                0.339861
$k_{20}$              -0.027547
$k_{21}$               0.021523
$k_{22}$               0.049887
$k_{23}$              -0.037745
$k_{24}$               0.022755
$k_{25}$               0.016079
$k_{2}$               -0.281855
$k_{3}$                0.013191
$k_{4}$               -0.024614
$k_{5}$               -0.252638
$k_{6}$               -0.003799
$k_{7}$               -0.301939
$k_{8}$               -0.003573
$k_{9}$                0.082121
dtype: float64
Variable  $e_{\sigma_{high}}$  $e_{\sigma_{low}}$  $f_{high}$  $f_{low}$  \
0.05                 0.297459            0.017210    0.260717   0.490231   
0.95                 0.536675            0.054735    0.509769   0.739283   

Variable  $h_{10}$  $h_{11}$  $h_{12}$  $h_{13}$  $h_{14}$  $h_{15}$  \
0.05     -0.029940 -0.004900 -0.590941 -0.057288 -0.074396 -0.074367   
0.95      0.057732  0.099934 -0.461023  0.030553  0.021749  0.058293   

Variable    ...     $k_{24}$  $k_{25}$   $k_{2}$   $k_{3}$   $k_{4}$  \
0.05        ...    -0.033687 -0.036122 -0.413240 -0.041508 -0.107950   
0.95        ...     0.097132  0.079361 -0.152603  0.076090  0.033822   

Variable   $k_{5}$   $k_{6}$   $k_{7}$   $k_{8}$   $k_{9}$  
0.05     -0.377970 -0.060073 -0.433376 -0.060037 -0.049758  
0.95     -0.124117  0.048988 -0.168269  0.051862  0.206946  

[2 rows x 54 columns]

In [21]:
## Scatter matrix plot:

## Redefine the trace so that we only vizualize every 10th latent variable element in 
## the scatter_matrix plot below. Vizualizing all 50 is too cumbersome for the scatter
## matrix. 

samples_Nm2_for_scatter_matrix = {}

## adjust the thin varible to only look at every 10th population element by setting it to 10
thin = 10
numHyperParams = 4
dim = (Ndata/thin)*2 + numHyperParams
print(dim)

for i in np.arange(0,Ndata,thin):
    samples_Nm2_for_scatter_matrix.update({'$h_{'+str(i+1)+'}$': samples2['h'][i,start::,:], '$k_{'+str(i+1)+'}$': samples2['k'][i,start::,:]})
samples_Nm2_for_scatter_matrix.update({'$e_{\sigma_{low}}$': esigma_low, '$e_{\sigma_{high}}$': esigma_hi })
samples_Nm2_for_scatter_matrix.update({'$f_{low}$': f_low,'$f_{high}$': f_hi })

for j, i in samples_Nm2_for_scatter_matrix.items():
    print(j)
#    print(i)

trace_3 = pd.Panel({k: v for k, v in samples_Nm2_for_scatter_matrix.items()})

sm = scatter_matrix(trace_3.to_frame(),  color="darkturquoise", alpha=0.2, figsize=(dim*2, dim*2), diagonal='hist',hist_kwds={'bins':25,'histtype':'step', 'edgecolor':'r','linewidth':2})
## y labels size
[plt.setp(item.yaxis.get_label(), 'size', 20) for item in sm.ravel()]
## x labels size 
[plt.setp(item.xaxis.get_label(), 'size', 20) for item in sm.ravel()]
## Change label rotation
## This is helpful for very long labels
#[s.xaxis.label.set_rotation(45) for s in sm.reshape(-1)]
[s.xaxis.label.set_rotation(0) for s in sm.reshape(-1)]
[s.yaxis.label.set_rotation(0) for s in sm.reshape(-1)]
## May need to offset label when rotating to prevent overlap of figure
[s.get_yaxis().set_label_coords(-0.5,0.5) for s in sm.reshape(-1)]
## Hide all ticks
#[s.set_xticks(()) for s in sm.reshape(-1)]
#[s.set_yticks(()) for s in sm.reshape(-1)]

plt.savefig('scatter_matrix_Nm2_h_and_k_JAGS.png')


14.0
$k_{21}$
$h_{21}$
$e_{\sigma_{low}}$
$h_{31}$
$h_{1}$
$h_{11}$
$f_{low}$
$h_{41}$
$f_{high}$
$k_{1}$
$k_{11}$
$k_{31}$
$k_{41}$
$e_{\sigma_{high}}$

In [ ]: